VIII. PASM, THE F-PC ASSEMBLER
PASM.SEQ is an assembler which is based on an 8086 assembler
published in Dr. Dobb's Journal, February 1982, by Ray Duncan.
This assembler was subsequently modified by Robert L. Smith to
repair bugs, and support the prefix assembler notation. Bob
discovered a very simple method to let a postfix assembler
assemble prefix code, by deferring assembly until the next
assembler command or the end of line, when all the arguments for
the previous assembler command are piled on the top of the data
stack. Tom Zimmer has made additional modifications to allow
syntax switching, and to increase compatibility in postfix mode
with the F83 Assembler.
Writing assembly programs is black magic. It is not appropriate
to discuss the joys and frustrations of working at such a low
level in this manual. However, F-PC provides the best
environment for you to experiment with assembly language,
because you can first verify the algorithm and methodology in
high level Forth code and then gradually reduce the code to the
assembly level. You will find numerous examples in which the
high level code in F83 is recoded in assembly, in addition to
many of the F83 kernel words which were in assembly already.
The best way to learn 8086 assembly language is to use PASM,
armed with all the code words in F-PC as templates and examples.
Factor your high level words carefully so that words at the
bottom level can be conveniently recoded in assembly. Take the
kernel words as templates to start with, and modify them so that
they will do exactly what you want them to do.
1. PREFIX OR POSTFIX ?
PASM supports dual syntaxes. The words PREFIX and POSTFIX switch
between the two supported modes. The postfix mode is very
similar to F83's CPU8086 Assembler. Prefix mode, which is the
default mode, allows a syntax which is much closer to MASM used
by Intel and Microsoft.
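As a rough sketch of the two modes, the same small word is written below once in each syntax. The names DOUBLE-PRE and DOUBLE-POST are hypothetical, invented only for illustration. The instruction forms are taken from the syntax comparison table in this chapter, and 1PUSH is the PASM macro that pushes AX and continues with NEXT. The sketch has not been checked against a running F-PC system.

```forth
PREFIX                          \ prefix mode (the default)
CODE DOUBLE-PRE ( n -- 2n )     \ hypothetical name
        POP AX                  \ fetch n from the data stack
        SHL AX, # 1             \ shift left by one bit
        1PUSH                   \ push AX and go to NEXT
END-CODE

POSTFIX                         \ F83 style: operands first
CODE DOUBLE-POST ( n -- 2n )    \ hypothetical name
        AX POP
        AX SHL
        1PUSH
END-CODE

PREFIX                          \ return to the default mode
```

If both words assemble, 5 DOUBLE-PRE . and 5 DOUBLE-POST . should print the same number.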
The assembler supports prefix syntax in an attempt to provide a
syntax which is more readable to programmers of other languages.
The use of sequential text files for source code encourages the
programmer to write programs in the vertical code style, with one
statement per line. This style is what traditional assemblers
require. F-PC works well in this style, if you choose to use it.
However, F-PC does not prevent you from writing in the horizontal
code style, in which you can squeeze many statements into one
line and make your own life miserable. PASM supports postfix
syntax to avoid alienating the established base of F83 users.
The prefix notation is close to the original Intel assembly
syntax, and certainly will be more familiar to programmers of
other languages. All the code words defined in F-PC are coded in
the prefix notation. Please consider writing any new assembly
code you need in the prefix mode for distribution and
assimilation.
The assembly of a machine instruction is generally deferred until
one of the following three events: the next assembly mnemonic is
encountered, the end of the line is reached, or the command
END-CODE or A; is executed. Therefore, a good style in writing
code words in F-PC is to put one assembly instruction on each
line, followed by its parameter specification or arguments.
Multiple assembly instructions are allowed on the same line,
except for the assembly directives which build control structures
in a code word, such as IF, ELSE, THEN, BEGIN, WHILE, AGAIN, etc.
These directives must be the first or the only instruction on a
line because they act immediately, without waiting for the next
assembly instruction. It is a good idea to put these structure
words on separate lines with proper indentation so that the nested
structures in a code definition can be perceived more readily.
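The layout rules above can be sketched with a small code word. MY-ABS is a hypothetical name. The example assumes that the F83 test condition 0< and the NEG mnemonic are available in prefix mode, as the glossary section suggests but does not show. Each instruction sits on its own line, and the structure directives IF and THEN each start a line.

```forth
CODE MY-ABS ( n -- |n| )        \ hypothetical name
        POP AX                  \ assembled when OR is encountered
        OR AX, AX               \ set the flags from AX
        0< IF                   \ directive: first on its line
                NEG AX          \ executed only when n is negative
        THEN                    \ directive: on a line of its own
        1PUSH                   \ push AX and go to NEXT
END-CODE
```

The indentation inside IF ... THEN is only for the reader. The assembler itself relies on the line breaks, because IF and THEN act immediately and the end of the preceding line is what forces the previous instruction to be assembled.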
2. PASM GLOSSARY
Here we will only give a small list of PASM words in this
glossary. All assembly mnemonics are identical to those defined
in F83 8086 Assembler. All the structure directives and test
conditions are also identical to those in F83. Only the most
important FORTH words controlling the assembler are listed here.
PREFIX      Assert prefix mode for the following code definitions.
POSTFIX     Assert postfix mode for the following code definitions.
CODE        Define "name" as a new code definition. Assembly
            language follows, terminated by END-CODE.
END-CODE    Terminates CODE definitions, checks error conditions,
            and makes the code definition available for searching
            and execution.
A;          Completes the assembly of the previous instruction.
BYTE        Assemble current and subsequent code using byte
            arguments, if register size is not explicitly specified.
WORD        Assemble current and subsequent code using 16 bit
            arguments, if register size is not explicitly specified.
LABEL       Start an assembly subroutine or mark the current code
            address to be referenced later.
3. SYNTAX COMPARISON
The differences among the F-PC prefix mode, the F83 postfix mode,
and the Intel MASM notation are best illustrated by the following
table. Although the table is not exhaustive, it covers most of
the cases useful in doing PASM programming. You are welcome to
suggest additional cases to be included in this table.
PREFIX                  POSTFIX                 MASM
AAA                     AAA                     AAA
ADC AX, SI              SI AX ADC               ADC AX,SI
ADC DX, 0 [SI]          0 [SI] DX ADC           ADC DX,0[SI]
ADC 2 [BX+SI], DI       DI 2 [BX+SI] ADC        ADC 2[BX][SI],DI
ADC MEM BX              BX MEM #) ADC           ADC MEM,BX
ADC AL, # 5             5 # AL ADC              ADC AL,5
AND AX, BX              BX AX AND               AND AX,BX
AND CX, MEM             CX MEM #) AND           AND CX,MEM
AND DL, # 3             3 # DL AND              AND DL,3
CALL NAME               NAME #) CALL            CALL NAME
CALL FAR [] NAME        FAR [] NAME #) CALL     ?????
CMP DX, BX              BX DX CMP               CMP DX,BX
CMP 2 [BP], SI          SI 2 [BP] CMP           CMP [BP+2],SI
DEC BP                  BP DEC                  DEC BP
DEC MEM                 MEM DEC                 DEC MEM
DEC 3 [SI]              3 [SI] DEC              DEC 3[SI]
DIV CL                  CL DIV                  DIV CL
DIV MEM                 MEM DIV                 DIV MEM
IN PORT# WORD           WORD PORT# IN           IN AX,PORT#
IN PORT#                PORT# IN                IN AL,PORT#
IN AX, DX               DX AX IN                IN AX,DX
INC MEM BYTE            BYTE MEM INC            INC MEM BYTE
INC MEM WORD            MEM #) INC              INC MEM WORD
INT 16                  16 INT                  INT 16
JA NAME                 NAME JA                 JA NAME
JNBE NAME               NAME #) JNBE            JNBE NAME
JMP NAME                NAME #) JMP             JMP NAME
JMP FAR [] NAME         NAME [] FAR JMP         JMP [NAME]
JMP FAR $F000 $E987                             JMP F000:E987
LODSW                   AX LODS                 LODS WORD
LODSB                   AL LODS                 LODS BYTE
LOOP NAME               NAME #) LOOP            LOOP NAME
MOV DX, NAME            NAME #) DX MOV          MOV DX,[NAME]
MOV AX, BX              BX AX MOV               MOV AX,BX
MOV AH, AL              AL AH MOV               MOV AH,AL
MOV BP, 0 [BX]          0 [BX] BP MOV           MOV BP,0[BX]
MOV ES: BP, SI          ES: BP SI MOV           MOV ES:BP,SI
MOVSW                   AX MOVS                 MOVS WORD
POP DX                  DX POP                  POP DX
POPF                    POPF                    POPF
PUSH SI                 SI PUSH                 PUSH SI
REP                     REP                     REP
RET                     RET                     RET
ROL AX, # 1             AX ROL                  ROL AX,1
ROL AX, CL              AX CL ROL               ROL AX,CL
SHL AX, # 1             AX SHL                  SHL AX,1
XCHG AX, BP             BP AX XCHG              XCHG AX,BP
XOR CX, DX              DX CX XOR               XOR CX,DX
4. ADDRESSING MODES
The most difficult problem in using 8086 assembler is to figure
out the correct addressing mode and code it into an instruction.
You can get a good idea and probably figure out most of the
addressing mode syntax from the above table. However, there are
cases where the table falls short. Here we will try to summarize the
addressing syntax more systematically to show you how F-PC
handles addresses in the prefix mode.
Register Mode
Source or destination is a register in the CPU. The source
register specifications are:
AL BL CL DL AH BH CH DH
AX BX CX DX SP BP SI DI IP RP CS DS SS ES
Destination register specifications are:
AL, BL, CL, DL, AH, BH, CH, DH,
AX, BX, CX, DX, SP, BP, SI, DI, IP, RP, CS, DS, SS, ES,
Immediate Mode
The argument is assembled as a literal in the instruction. The
immediate value must be preceded by the symbol #, which is a word
and must be delimited by spaces:
MOV AX, # 1234
ADD CL, # 32
ROL AX, # 3
Direct Mode
An address is assembled into the instruction. This is used to
specify an address to be jumped to or a memory location for data
reference. The address is used directly as a 16 bit number.
Depending on the instruction, the address may be assembled
unmodified or assembled as an eight bit offset in the branch
instructions. To jump or call beyond a 64K byte segment, the
address must be preceded by FAR [] . Examples are:
CALL FAR [] <label>
JMP <dest>
MOV BX, <source>
INC <dest> WORD
JZ <label>
The destination address may be taken from the data stack
directly:
MOV CX, # 16
HERE ( save current code address on stack)
...
...
LOOPZ ( loop back to HERE if condition fails)
Index Mode
One or two registers can be used as index registers to scan
through data arrays. The contents of the index register, or the
sum of the contents of two index registers, form a base address.
An offset is then added to the base address to form the
true address for data reference. Examples are:
CMP 2 [BP], SI
DEC 3 [SI]
MOV BP, 0 [BX]
The following register index specifications are allowed in F-PC:
[SI] [IP] [BP] [RP] [DI] [BX]
[BX+SI] [SI+BX] [BX+IP] [IP+BX] [BX+DI] [DI+BX]
[BP+SI] [SI+BP] [BP+IP] [IP+BP] [RP+IP] [IP+RP]
[BP+DI] [DI+BP] [RP+DI] [DI+BP] [RP+DI] [DI+RP]
There must be an offset number preceding the index register
specification, even if the offset is 0. When the index register
is used as destination, a comma must be appended immediately:
MOV 0 [BX+IP], AX
Implied Mode and Segment Override
The implied mode is where mistakes are most likely to occur,
because you have to be keenly aware of which segment
register is used by the instruction at any instant. Since the
segment register is implied and not stated explicitly, the bug
can generally hide very securely underneath, laughing at you. The
code works when you test it but fails when the segment register
is modified.
Branch and jump instructions use the CS segment register.
Data movement instructions use the DS segment register.
Stack instructions use the SS segment register.
String instructions use DS:SI as source and ES:DI as destination.
If you need to specify an address with a segment register other
than the default implied register, use a segment override
instruction before the address specification:
CS: DS: ES: SS:
Examples are:
MOV ES: BP, SI
CMP CS: 2 [BP], AX
ADD AX, ES: 10 [BX+DI]
The 8086 addressing modes are so confusing that even an experienced
programmer needs a good Intel 8086 manual to find the right
addressing mode and the F-PC assembler syntax table to determine
the correct argument list.
The best way to write assembly code is still to keep the code
short and simple. It is very easy in F-PC to break a long CODE
definition into many small fragments which are initially defined
as separate CODE definitions. After verifying that each fragment
works, you can edit out the CODE, NEXT, and END-CODE lines to
combine the fragments into a single CODE definition.
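The fragment approach might look like the following sketch. The three word names are hypothetical. The first two definitions are the fragments, each of which can be tested from the keyboard. The third shows the result after the inner NEXT, END-CODE, and CODE lines have been edited out, with nothing else changed.

```forth
CODE INCR-IT ( n -- n+1 )       \ fragment 1, hypothetical name
        POP AX
        INC AX
        PUSH AX
        NEXT
END-CODE

CODE TWICE-IT ( n -- 2n )       \ fragment 2, hypothetical name
        POP AX
        SHL AX, # 1
        PUSH AX
        NEXT
END-CODE

CODE INCR-TWICE ( n -- 2n+2 )   \ the two fragments combined
        POP AX
        INC AX
        PUSH AX                 \ end of fragment 1
        POP AX                  \ start of fragment 2
        SHL AX, # 1
        PUSH AX
        NEXT
END-CODE
```

The PUSH AX and POP AX pair in the middle is now redundant and can be removed in a later pass, once the combined word has been verified against the two fragments.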
Charles Curley kindly contributed an 8086 disassembler with a
single step debugger. It is helpful to disassemble the CODE word
you defined and see what the computer makes of what you meant.
There is always this 'Do what I mean, not what I say' syndrome.
Stepping through a piece of code one instruction at a time is the
last resort when everything else has failed.
5. MACROS IN PASM
Another area of interest is macros. Here is the definition of
the NEXT macro:
: NEXT >PRE JMP >NEXT A; PRE> ;
The macro itself is simply the sequence JMP >NEXT. The
surrounding words are used for support. Since PASM supports both
postfix and prefix notation, it is not known on entry to a
macro which mode is selected. The words >PRE and PRE> select
prefix mode and restore the previous mode, so macros can always
be written in prefix notation. The A; after >NEXT forces the
assembly of the JMP instruction before the mode switch.
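A user macro can follow the same pattern as NEXT. The sketch below defines a hypothetical macro POP2. It is not part of PASM.SEQ. It assumes that the assembler vocabulary is in the search order when the colon definition is compiled, so that POP, AX, and BX are found as assembler words, as JMP must be for NEXT.

```forth
\ POP2 is a hypothetical macro: pop two stack items into BX and AX.
\ Assumes the assembler words are visible while this line compiles.
: POP2  ( -- )  >PRE   POP BX   POP AX   A;   PRE>  ;
```

The second POP forces the assembly of POP BX, and the A; forces the assembly of POP AX before PRE> restores the caller's mode. Used inside a CODE definition, POP2 should behave the same way whether the surrounding code is written in prefix or postfix mode.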
You can find many other examples of assembly macros in PASM.SEQ,
like 1PUSH, 2PUSH, and all the structure building directives.
6. LOCAL LABEL
To support large code definitions, Bob Smith introduced 'local
labels' to F-PC. The local labels are place markers $: preceded
by a number. They are used to mark locations in a large code
definition for forward and backward jumps and branches. They can
be used quite freely in a range of code words, and reusing them
saves head space, because they replace LABELs, which have global
names and cannot be reused.
The use of local labels is best demonstrated by an example taken
from the software floating point package SFLOAT.SEQ by Bob Smith,
shown in Figure 21. Up to 32 local labels can be used to mark
addresses of assembly code. They can be referred to before or
after their placements. They can be referenced across code word
boundaries. The command CLEAR-LABELS defines a boundary which
local label references cannot cross. Between two
consecutive CLEAR-LABELS, local labels can be freely placed and
referenced.
This technique is especially useful where the one-entry-one-exit
dogma is awkward, as when a piece of code has multiple entry
points and is shared among many code word definitions. It
allows us to construct structured spaghetti code, if there is
such a thing.
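A small sketch of a forward reference to a local label follows. It is not taken from SFLOAT.SEQ, and MY-UMAX is a hypothetical name. The marker form 1 $: is the one described above. The reference form 1 $ used as the jump target is an assumption about PASM's syntax, so check it against PASM.SEQ before relying on it.

```forth
CLEAR-LABELS                    \ start a fresh range of local labels
CODE MY-UMAX ( u1 u2 -- umax )  \ hypothetical name
        POP AX                  \ AX = u2
        POP BX                  \ BX = u1
        CMP AX, BX
        JA 1 $                  \ assumed reference syntax: skip if AX > BX
        MOV AX, BX              \ otherwise BX is the larger value
1 $:                            \ local label 1 marks the common exit
        1PUSH                   \ push AX and go to NEXT
END-CODE
```

Because the label is local, the number 1 can be used again after the next CLEAR-LABELS without consuming any head space for a global name.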
7. INLINE CODE
INLINE allows us to include machine code inside a high level
colon definition. This is easily done in F-PC because it is
built on direct threaded code. Every word is compiled as a code
address in the colon definition. The code in the code field
pointed to by the code address is executed directly because it is
genuine 8086 machine code. Whether the code belongs to a colon
definition or a code definition does not make any difference.
INLINE only has to compile the address pointing to the top of the
dictionary in the code segment. The assembler can then be
invoked to compile machine code. If the code is terminated by
NEXT or one of its derivatives, the next word compiled in the
colon definition will be executed after the assembly code is
done. END-INLINE only has to clean up the assembly environment
and return control to the colon compiler.
Here is an example of how to use INLINE and END-INLINE to add
assembly code in the middle of a colon definition:
: TEST ( -- )
        5 0 DO
                I                       \ Get loop index
                INLINE
                        pop ax          \ pop I
                        add ax, # 23    \ add 23
                        1push           \ push sum
                END-INLINE
                .                       \ print results
        LOOP
        ;